Remarks

Claims 1-15 and 17-28 are currently pending in the subject application and are presently 
under consideration. Claims 19 and 24 have been amended as shown on pages 2-5 of the Reply. 
Applicants' representative thanks the Examiner for the telephone interview of May 27, 2008 
wherein merits of the claims vis-a-vis the cited documents were discussed. 

Favorable reconsideration of the subject patent application is respectfully requested in 
view of the comments and amendments herein. 

I. Objection to Claim 19

Claim 19 is objected to because of minor informalities. Withdrawal of this objection is 
requested for the following reasons. Claim 19 has been amended to cure the minor
informalities pointed out by the Examiner; accordingly, this objection should be withdrawn.

II. Rejection of Claim 19 Under 35 U.S.C. § 112

Claim 19 stands rejected under 35 U.S.C. § 112, first paragraph, as failing to comply with
the written description requirement. Withdrawal of this rejection is requested for the following
reasons. Claim 19 has been amended to cure the minor informalities pointed out by the
Examiner. Hence, withdrawal of this rejection is respectfully requested.

III. Rejection of Claims 24-25 Under 35 U.S.C. § 102(e)

Claims 24-25 stand rejected under 35 U.S.C. § 102(e) as being anticipated by Evans (US 
2004/0030683). Withdrawal of this rejection is requested for the following reasons. Evans fails 
to disclose or suggest all features set forth in the subject claims. 

A single prior art reference anticipates a patent claim only if it 
expressly or inherently describes each and every limitation set 
forth in the patent claim. Trintec Industries, Inc. v. Top-U.S.A. 
Corp., 295 F.3d 1292, 63 USPQ2d 1597 (Fed. Cir. 2002); see
Verdegaal Bros. v. Union Oil Co. of California, 814 F.2d 628, 631, 
2 USPQ2d 1051, 1053 (Fed. Cir. 1987). The identical invention 
must be shown in as complete detail as is contained in the ... 
claim. Richardson v. Suzuki Motor Co., 868 F.2d 1226, 9 USPQ2d 
1913, 1920 (Fed. Cir. 1989). 

Applicants' claimed subject matter provides for a system and method of facilitating
incremental web crawls using chunks. Information gathered from a web crawl is indexed and 
chunked based on similar properties like average time between change and average importance 
of the retrieved documents. A chunk map is then employed to determine which chunks
should be re-crawled. To this end, amended independent claim 24 recites a data packet 
transmitted between two or more computer components that facilitates document re-crawl, the 
data packet comprising: a chunk header that includes metadata associated with the data 
packet, wherein a chunk comprises document files that have similar properties; an offset 
section that provides offset information associated with document files; and the document files 
that include content found on the Internet, wherein the average of the at least one of the 
properties of all the document files determines if the document should be re-crawled. Evans is 
silent regarding such novel features recited by the subject claims. 
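
By way of illustration only, the recited packet layout can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `ChunkPacket`, `build_packet`, `get_document` and `should_recrawl`, the `avg_change_interval` header field, and the threshold comparison are all hypothetical and do not appear in the claims or in the cited documents.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChunkPacket:
    header: Dict[str, float]  # hypothetical chunk-level metadata
    offsets: List[int]        # start offset of each document in `payload`
    payload: bytes            # the document files, concatenated


def build_packet(docs: List[bytes], change_intervals: List[float]) -> ChunkPacket:
    """Pack documents with similar properties into one chunk packet."""
    offsets, position = [], 0
    for doc in docs:
        offsets.append(position)
        position += len(doc)
    header = {
        "doc_count": float(len(docs)),
        # Average of one property over all document files in the chunk.
        "avg_change_interval": sum(change_intervals) / len(change_intervals),
    }
    return ChunkPacket(header, offsets, b"".join(docs))


def get_document(packet: ChunkPacket, index: int) -> bytes:
    """Use the offset section to slice one document out of the payload."""
    start = packet.offsets[index]
    if index + 1 < len(packet.offsets):
        end = packet.offsets[index + 1]
    else:
        end = len(packet.payload)
    return packet.payload[start:end]


def should_recrawl(packet: ChunkPacket, elapsed: float) -> bool:
    """Assumed rule: re-crawl once the elapsed time reaches the chunk average."""
    return packet.header["avg_change_interval"] <= elapsed
```

In this sketch the re-crawl decision consults only the averaged value carried in the chunk header, not any per-document record.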

Evans relates to a system and process for mediated crawling. At the cited portions, Evans 
discloses a system for retrieving network based content, including media files and data related to 
media files, on a computer network via a search system utilizing metadata. When a media file is 
transmitted, the metadata of that file is transmitted along with it. In contrast, the claimed 
invention allows for information of documents gathered from a web crawl to be indexed and 
chunked based on similar properties like average time between change and average importance 
of the retrieved documents. Each chunk comprises a header that includes metadata that is 
transmitted, wherein each chunk comprises document files that have similar properties. This 
allows for transmitting metadata that reflects the average of the properties of all the documents in 
that chunk. Depending on the metadata, it is determined whether all the documents in that chunk
should be re-crawled. Thus, Evans is silent regarding a chunk header that includes metadata
associated with the data packet, wherein a chunk comprises document files that have similar
properties, wherein the average of the at least one of the properties of all the document files
determines if the document should be re-crawled, as recited by the subject claim.

From the foregoing, it is clear that an identical invention as recited in the subject claims
is not taught or suggested by Evans. Accordingly, it is requested that this rejection with respect
to independent claim 24 (and the claims that depend therefrom) be withdrawn.

IV. Rejection of Claims 1-4, 6-15, 17, 19-23 and 26-28 Under 35 U.S.C. § 103(a)

Claims 1-4, 6-15, 17, 19-23 and 26-28 stand rejected under 35 U.S.C. § 103(a) as being 
unpatentable over Najork, et al. (US 6,263,364) in view of Evans. Withdrawal of this rejection 
is requested for the following reasons. Najork et al. and Evans, alone or in combination, fail to 
teach or suggest all aspects set forth in the subject claims. 

Applicants' claimed subject matter provides for a system and method of facilitating 
incremental web crawls using chunks. Information gathered from a web crawl is indexed and 
chunked based on similar properties like average time between change and average importance 
of the retrieved documents. A chunk map is then employed to determine which chunks
should be re-crawled. To this end, independent claim 1 recites a system that facilitates 
incremental web crawls comprising: an indexer that places items with similar properties into 
respective chunks; and a chunk map that stores at least some of the properties associated with 
the respective chunk, wherein the properties are at least one of average time between change or 
average importance of documents comprising a particular chunk, the chunk map employed to 
facilitate an incremental web re-crawl. Independent claim 26 recites similar features. Najork et 
al. and Evans are silent regarding such novel features. 

Najork et al. relates to a web crawler that downloads documents from a plurality of host
computers and enqueues document addresses, where each queue has documents sharing a 
common host component. At page 5 of the Final Office Action, the Examiner concedes that 
Najork et al. does not teach storing the properties associated with a respective chunk in a chunk 
map, the chunk map employed to facilitate an incremental web crawl. The Examiner cites Evans 
to cure the aforementioned deficiencies of Najork et al. 

Evans relates to a system and process for mediated crawling. At the cited portions, Evans 
discloses storing auxiliary information pertaining to the encountered web sites in a database. 
This stored information is employed to determine how often to re-crawl. Thus, the system stores
the properties of each and every document in a database. In contrast, the claimed invention 
allows for chunking, wherein a set of documents that have similar properties are placed in a 
chunk. Each chunk and its properties are stored in a chunk map and this allows for the chunk to 
be manipulated as one set. Each document does not have to be analyzed, but the whole chunk 
can be re-crawled depending on its properties. Thus, Evans is silent regarding a chunk map that
stores at least some of the properties associated with the respective chunk, and the chunk map
employed to facilitate an incremental web re-crawl as recited by independent claims 1 and 26. 
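
By way of illustration only, the distinction between per-document records and a chunk map can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary keys (`change_days`, `importance`), the bucketing rule used to decide that items are "similar", and the selection threshold are all hypothetical and are not taken from the claims or the cited documents.

```python
from collections import defaultdict
from statistics import mean


def build_chunks(docs, bucket_days=7):
    """Indexer step: place items with similar properties into the same chunk.

    Similarity here is an assumed rule: documents whose change interval
    falls in the same `bucket_days`-wide bucket share a chunk.
    """
    chunks = defaultdict(list)
    for doc in docs:
        chunks[int(doc["change_days"] // bucket_days)].append(doc)
    return dict(chunks)


def build_chunk_map(chunks):
    """Store chunk-level properties: averages over the chunk's documents."""
    return {
        chunk_id: {
            "avg_change_days": mean(d["change_days"] for d in members),
            "avg_importance": mean(d["importance"] for d in members),
        }
        for chunk_id, members in chunks.items()
    }


def chunks_to_recrawl(chunk_map, days_since_crawl):
    """Select whole chunks for re-crawl from the chunk map alone."""
    return sorted(
        chunk_id
        for chunk_id, props in chunk_map.items()
        if props["avg_change_days"] <= days_since_crawl
    )
```

In this sketch `chunks_to_recrawl` never inspects an individual document; each chunk is selected or skipped as one set based on its stored averages.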

Independent claim 19 recites a method of performing document re-crawl comprising: 
parsing a first chunk for uniform resource locators, wherein a chunk map that stores properties 
associated with the respective chunk is employed to determine the first chunk; re-crawling the 
uniform resource locators; and forming a second chunk separate from the first chunk, based at 
least in part, upon the re-crawled uniform resource locators. Najork et al. and Evans are silent 
regarding such novel features. At the cited portions, Najork et al. discloses performing a re-crawl
of a queue and enqueueing any new URLs into the front-end queue or into another queue
depending on the host identifier of the new URL. In contrast, the claimed invention allows for
forming a second chunk, separate from the first chunk, if the indexer determines that the
document belonging to the new URL does not belong to an existing chunk. Thus, Najork et al. is
silent regarding forming a second chunk separate from the first chunk, based at least in part, 
upon the re-crawled uniform resource locators as recited by independent claim 19. Evans does 
not compensate for the aforementioned deficiency of Najork et al. 

In view of at least the foregoing, it is readily apparent that Najork et al. and Evans, either
alone or in combination, do not teach or suggest each and every element set forth in the
applicants' subject claims. Accordingly, it is requested that this rejection be withdrawn.

V. Rejection of Claim 5 Under 35 U.S.C. § 103(a) 

Claim 5 stands rejected under 35 U.S.C. § 103(a) as being unpatentable over Najork, et al. 
in view of Evans and further in view of Eichstaedt, et al. (US 6,182,085). It is respectfully 
requested that this rejection be withdrawn for at least the following reasons. Claim 5 depends 
from independent claim 1, and as discussed supra, Najork et al. and Evans, alone or in 
combination, do not disclose all features of independent claim 1. Eichstaedt et al. relates to large 
scale information gathering utilizing collaborative team crawling, and does not make up for the 
aforementioned deficiencies with respect to independent claim 1. Thus, the subject invention as 
recited in the subject claims is not obvious over the combination of Najork et al., Evans and 
Eichstaedt et al. Accordingly, it is respectfully submitted that this rejection of claim 5 should be 
withdrawn. 

VI. Rejection of Claim 18 Under 35 U.S.C. § 103(a)

Claim 18 stands rejected under 35 U.S.C. § 103(a) as being unpatentable over Najork, et 
al. in view of Evans and further in view of Acharya, et al. (US 2007/0094255). It is
respectfully requested that this rejection be withdrawn for at least the following reasons. Claim
18 depends from independent claim 1, and as discussed supra, Najork et al. and Evans, alone or
in combination, do not disclose all features of independent claim 1. Acharya et al. relates to
document scoring based on link-based criteria, and does not make up for the aforementioned
deficiencies with respect to independent claim 1. Thus, the subject invention as recited in the
subject claims is not obvious over the combination of Najork et al., Evans and Acharya et al.
Accordingly, it is respectfully submitted that this rejection of claim 18 should be withdrawn.

Conclusion 

The present application is believed to be in condition for allowance in view of the above 
comments and amendments. A prompt action to such end is earnestly solicited. 

In the event any fees are due in connection with this document, the Commissioner is 
authorized to charge those fees to Deposit Account No. 50-1063 [MSFTP511US].

Should the Examiner believe a telephone interview would be helpful to expedite 
favorable prosecution, the Examiner is invited to contact applicants' undersigned representative 
at the telephone number below. 

Respectfully submitted, 
Amin, Turocy & Calvin, LLP

/Himanshu S. Amin/ 

Himanshu S. Amin 
Reg. No. 40,894 



Amin, Turocy & Calvin, LLP
24th Floor, National City Center
1900 E. 9th Street
Cleveland, Ohio 44114 
Telephone (216) 696-8730 
Facsimile (216) 696-8731 